Introduce utils.simulate() #348

vallis · 2023-05-17T01:55:20Z

simulate draws random residuals for all pulsars in pta by sampling every white-noise and GP object at the parameter values params.

Requires GPs to have combine=False, and will run faster with GP ECORR.
If pta includes a TimingModel, that should be created with a small prior_variance. Note that the variance applies to the entire design-matrix vector, and should be multiplied by len(psr.toas) if it is understood as the variance per individual residual.
This function can be used with utils.set_residuals to replace the residuals in a Pulsar object. Note that any PTA already built from that Pulsar may nevertheless cache residuals internally, so it is safer to rebuild the PTA from the modified Pulsar.
This function can also be used with parameter.sample(pta.params) to sample from the prior.
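
The remark about prior_variance and len(psr.toas) can be checked with a small NumPy sketch. It assumes the design-matrix column is normalized to unit Euclidean norm, which is what makes the variance apply to the whole vector; the column shape, the number of TOAs, and the target variance are made-up illustration values, not taken from the PR.

```python
import numpy as np

ntoa = 200                           # hypothetical number of TOAs
sigma2_per_residual = (1e-7) ** 2    # hypothetical target variance per residual

# Hypothetical design-matrix column, normalized to unit Euclidean norm (assumption).
col = np.linspace(-1.0, 1.0, ntoa)
col /= np.linalg.norm(col)

# Variance of the single coefficient a that multiplies the whole column.
prior_variance = sigma2_per_residual * ntoa

# Residual induced at TOA i is col[i] * a, so its variance is col[i]**2 * prior_variance.
per_toa_variance = col**2 * prior_variance

# Averaged over TOAs this gives prior_variance / ntoa, i.e. the per-residual target.
mean_variance = per_toa_variance.mean()
```

Under that normalization assumption, multiplying the per-residual variance by the number of TOAs gives the prior_variance that produces it on average.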

…ocate utils.KernelMatrix

codecov · 2023-05-17T02:17:35Z

Codecov Report

Attention: 25 lines in your changes are missing coverage. Please review.

Comparison is base (c49646c) 88.37% compared to head (ab7ef2b) 87.97%.
Report is 54 commits behind head on dev.

❗ Current head ab7ef2b differs from pull request most recent head a98e6c7. Consider uploading reports for the commit a98e6c7 to get more accurate results

Additional details and impacted files

@@            Coverage Diff             @@
##              dev     #348      +/-   ##
==========================================
- Coverage   88.37%   87.97%   -0.41%     
==========================================
  Files          13       13              
  Lines        3012     3051      +39     
==========================================
+ Hits         2662     2684      +22     
- Misses        350      367      +17

Files	Coverage Δ
enterprise/signals/gp_signals.py	`90.68% <100.00%> (ø)`
enterprise/signals/signal_base.py	`89.84% <86.44%> (-0.27%)`	⬇️
enterprise/signals/utils.py	`83.94% <59.52%> (-2.52%)`	⬇️

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c49646c...a98e6c7. Read the comment docs.

vhaasteren · 2023-05-18T06:11:30Z

I have a semi-related question, because you are introducing prior_variance in TimingModel. I have also run into the need to set prior_variance values for the timing model parameters, both for simulating data and for analysis purposes. The problem is that the prior on timing model parameters is highly specific to each parameter, and we have different parameters per pulsar. I also ran into this challenge when writing code to solve pulsars.

What I don't like about using prior_variance in this particular way is that it is not flexible enough to tailor the prior per timing model parameter per pulsar (I think; it's hard to see what BasisGP can really accept for priorFunction). I have an application in mind that requires that, so I have an interest in getting it right from the start.

In my previous work, my solution was to use the information in the par-files for this. The uncertainty column in the par-file is currently only used as output, even though libstempo and PINT do read it in. Actually, I'm not sure whether PINT can use it for anything else. But it's a perfect place in the par-file to write the prior width as input. I mean, the parameter values in the par-file are in reality also just the means of the prior in the analysis. People just haven't been thinking about it like that, but that's what it is. I'd also love to have a discussion about this with the timers. Ideally the timing group(s) would create data releases with more physical priors included on all parameters.

So my suggestion is to use the uncertainty column in the par-file to specify the prior width.
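
A minimal sketch of what reading the prior width from the par-file could look like. It assumes the common four-column layout NAME VALUE FITFLAG UNCERTAINTY and skips everything else; the function name read_prior_widths and the example par text (including the placeholder pulsar name and all numbers) are hypothetical and are not part of enterprise, libstempo, or PINT.

```python
def read_prior_widths(par_text):
    """Map parameter name -> (prior mean, prior width) for four-column par lines."""
    priors = {}
    for line in par_text.splitlines():
        fields = line.split()
        # Assumed layout: NAME VALUE FITFLAG UNCERTAINTY; other lines are skipped.
        if len(fields) != 4:
            continue
        name, value, _fit_flag, sigma = fields
        try:
            priors[name] = (float(value), float(sigma))
        except ValueError:
            # Entries that float() cannot parse (e.g. sexagesimal RAJ/DECJ) are skipped.
            continue
    return priors


# Made-up par-file fragment, for illustration only.
example_par = """\
PSR J0000+0000
F0 100.0 1 1e-12
F1 -1e-15 1 1e-20
DM 10.0 1 1e-4
"""

widths = read_prior_widths(example_par)
```

A real implementation would more likely take these numbers from the timing package, which already parses the uncertainty column, rather than re-reading the file by hand.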

vhaasteren · 2023-05-18T10:41:57Z

Two more questions:

Regarding caching of residuals. When doing non-linear timing models, you can't cache the residuals like normal either. Those come from the timing package when you change parameters. How (and where?) is that implemented? Somehow we should be able to tell the caching descriptor that the residuals have changed and related quantities like TNr should be recalculated.
Will it still be faster with GP ECORR if you use my FastShermanMorrison code? I saw somewhere in here that it depends on the rNr solve of ShermanMorrison, which is super slow.

vhaasteren

In general looks ok. I provided minor code feedback.

Question:

In the unit tests, right now the returned residuals are not actually checked, which makes the tests incomplete. Ideally we would generate an ensemble of realizations and compare that to the covariance matrix. That would be most realistic. It also sounds less feasible.
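
The ensemble check described above could be sketched as follows, using plain NumPy rather than enterprise objects. The basis fmat, the prior variances phi, and the white-noise variances ndiag are hypothetical stand-ins for what a PTA would provide, and the tolerance is a judgment call tied to the number of draws.

```python
import numpy as np

rng = np.random.default_rng(1)
ntoa, nbasis, ndraws = 8, 3, 20000

# Hypothetical stand-ins for a GP basis, its prior variances, and white-noise variances.
fmat = rng.standard_normal((ntoa, nbasis))
phi = np.array([1.0, 0.5, 0.25])
ndiag = np.full(ntoa, 0.1)

# Covariance the simulated residuals should follow: F diag(phi) F^T + diag(N).
cov_expected = fmat @ np.diag(phi) @ fmat.T + np.diag(ndiag)

# Draw an ensemble of realizations, one per row.
coeffs = np.sqrt(phi) * rng.standard_normal((ndraws, nbasis))
white = np.sqrt(ndiag) * rng.standard_normal((ndraws, ntoa))
draws = coeffs @ fmat.T + white

# Zero-mean sample covariance, compared element-wise to the expected one.
cov_sample = draws.T @ draws / ndraws
max_rel_err = np.max(np.abs(cov_sample - cov_expected)) / np.max(np.abs(cov_expected))
```

With a fixed seed the comparison is deterministic, which keeps such a test from failing at random; the statistical error shrinks roughly as one over the square root of ndraws.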

Also, my question regarding the timing model parameter priors still stands. IMO those should actually come from the timing package. I don't want to make that a show-stopper though.

Any thoughts?

vhaasteren · 2023-11-10T14:07:20Z

enterprise/signals/signal_base.py

+                inv = sl.cho_solve(cf, np.identity(cf[0].shape[0]))
+                if logdet:
+                    ld = 2.0 * np.sum(np.log(np.diag(cf[0])))
+            except np.linalg.LinAlgError:


Add a # pragma: no cover here?

vhaasteren · 2023-11-10T14:07:51Z

enterprise/signals/utils.py

+                gpresiduals.append(0)
+            elif phi.ndim == 1:
+                gpresiduals.append(np.dot(fmat, np.sqrt(phi) * np.random.randn(phi.shape[0])))
+            else:


Add a # pragma: no cover here
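
For context, the two branches in the hunk above (1-D versus 2-D phi) can be sketched in isolation. This is an illustrative reimplementation and not the code in the PR: the function name draw_gp is hypothetical, the 2-D branch uses a dense Cholesky factor as an assumption, and a seeded Generator replaces np.random.randn so the result is reproducible.

```python
import numpy as np


def draw_gp(fmat, phi, rng):
    """Draw F @ a with a ~ N(0, phi), for a diagonal (1-D) or full (2-D) phi."""
    noise = rng.standard_normal(phi.shape[0])
    if phi.ndim == 1:
        # Diagonal prior: scale each coefficient independently.
        coeffs = np.sqrt(phi) * noise
    else:
        # Full prior covariance: phi = L L^T, so L @ noise has covariance phi.
        coeffs = np.linalg.cholesky(phi) @ noise
    return fmat @ coeffs


fmat = np.random.default_rng(0).standard_normal((10, 3))   # hypothetical basis
phi_diag = np.array([4.0, 1.0, 0.25])

res_1d = draw_gp(fmat, phi_diag, np.random.default_rng(42))
res_2d = draw_gp(fmat, np.diag(phi_diag), np.random.default_rng(42))
```

With the same seed the two calls agree, because the Cholesky factor of a diagonal matrix is the element-wise square root of its diagonal.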

vhaasteren · 2023-11-10T14:08:49Z

enterprise/signals/utils.py

+    for delay, ndiag in zip(delays, ndiags):
+        if ndiag is None:
+            whiteresiduals.append(0)
+        elif isinstance(ndiag, ShermanMorrison):


Not a good idea to check for an instance of ShermanMorrison. When 'fastshermanmorrison' is used, this will be a different type. Instead, perhaps duck typing can be used, like:

if all(hasattr(ndiag, attr) for attr in ['_nvec', '_jvec', '_slices']):
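
A self-contained sketch of how that duck-typed check could drive the white-noise draw. FakeShermanMorrison is a hypothetical stand-in that only mimics the three attribute names from the suggestion; it assumes _nvec holds per-TOA variances, _jvec holds one ECORR variance per epoch, and _slices holds the matching slice objects. None of this is the actual enterprise or fastshermanmorrison implementation.

```python
import numpy as np


class FakeShermanMorrison:
    """Hypothetical stand-in exposing only the attributes the check looks for."""

    def __init__(self, jvec, slices, nvec):
        self._jvec = jvec
        self._slices = slices
        self._nvec = nvec


def looks_like_sherman_morrison(ndiag):
    # Duck typing: accept any object that carries the three expected attributes.
    return all(hasattr(ndiag, attr) for attr in ("_nvec", "_jvec", "_slices"))


def draw_white(ndiag, rng):
    """Draw white noise, adding one shared ECORR offset per epoch when present."""
    if looks_like_sherman_morrison(ndiag):
        out = np.sqrt(ndiag._nvec) * rng.standard_normal(len(ndiag._nvec))
        for jvar, slc in zip(ndiag._jvec, ndiag._slices):
            # Every TOA in the epoch receives the same random offset.
            out[slc] += np.sqrt(jvar) * rng.standard_normal()
        return out
    # Plain diagonal noise given as an array of variances.
    return np.sqrt(ndiag) * rng.standard_normal(len(ndiag))


rng = np.random.default_rng(3)
sm = FakeShermanMorrison(
    jvec=np.array([1.0, 2.0]),
    slices=[slice(0, 2), slice(2, 5)],
    nvec=np.zeros(5),
)
ecorr_only = draw_white(sm, rng)
plain = draw_white(np.full(5, 0.5), rng)
```

Because the check never names a class, the same branch would be taken by any alternative implementation that keeps these attribute names, which is the point of the suggestion.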

vhaasteren · 2023-11-10T14:10:28Z

enterprise/signals/utils.py

+            cf = cholesky(sps.csc_matrix(phis))
+            gp = np.zeros(phis.shape[0])
+            gp[cf.P()] = np.dot(cf.L().toarray(), np.random.randn(phis.shape[0]))
+        else:


Add a # pragma: no cover here

vallis added 3 commits May 16, 2023 18:06

Just some comments

994382f

Merging from master

3ad5205

Add utils.simulate, give parameter prior_variance to TimingModel, rel…

ab7ef2b

…ocate utils.KernelMatrix

vhaasteren changed the base branch from master to dev November 7, 2023 09:00

vhaasteren self-requested a review November 7, 2023 09:07

vhaasteren added the enhancement label Nov 7, 2023

vhaasteren requested changes Nov 10, 2023


Merge branch 'master' of https://github.com/nanograv/enterprise

a98e6c7
